YGGDRASIL-A statistical package for learning Split Models

نویسنده

  • Søren Højsgaard
چکیده

There are two main objectives of this paper. The first is to present a statistical framework for models with context specific indepen­ dence structures, i.e. conditional independen­ cies holding only for specific values of the con­ ditioning variables. This framework is consti­ tuted by the class of split models. Split mod­ els are an extension of graphical models for contingency tables and allow for a more so­ phisticated modelling than graphical models. The treatment of split models include estima­ tion, representation and a Markov property for reading off those independencies holding in a specific context. The second objective is to present a software package named YG­ GDRASIL which is designed for statistical in­ ference in split models, i.e. for learning such models on the basis of data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale

Deep distributed decision trees and tree ensembles have grown in importance due to the need to model increasingly large datasets. However, PLANET, the standard distributed tree learning algorithm implemented in systems such as XGBOOST and Spark MLLIB, scales poorly as data dimensionality and tree depths grow. We present YGGDRASIL, a new distributed tree learning method that outperforms existing...

متن کامل

Splitting It Up: The spduration Split-Population Duration Regression Package for Time-Varying Covariates

We present an implementation of split-population duration regression in the spduration (Beger et al., 2017) package for R that allows for time-varying covariates. The statistical model accounts for units that are immune to a certain outcome and are not part of the duration process the researcher is primarily interested in. We provide insights for when immune units exist, that can significantly ...

متن کامل

Introduction Package CircOutlier For Detection of Outliers in Circular-Circular Regression

One of the most important problem in any statistical analysis is the existence of unexpected observations. Some observations are not a part of the study and are known as outliers. Studies have shown that the outliers affect to the performance of statistical standard methods in models and predictions. The point of this work is to provide a couple of statistical package in R software to identi...

متن کامل

Push-Button Verification of File Systems via Crash Refinement

The file system is an essential operating system component for persisting data on storage devices. Writing bug-free file systems is non-trivial, as they must correctly implement and maintain complex on-disk data structures even in the presence of system crashes and reorderings of disk operations. This paper presents Yggdrasil, a toolkit for writing file systems with push-button verification: Yg...

متن کامل

Boosting Algorithms: Regularization, Prediction and Model Fitting

We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival analysis. Concepts of degrees of freedom and corresponding Akaike or Bayesian information criteria, particularly useful for regularization and variable selectio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000